Fix model-downloader and tgi in multi shard case #642
Conversation
Approved.
Cache being in emptyDir, i.e. going away when the pod instance goes away (instead of being shared like model data), is more secure, but it can be a significant pod startup performance issue, especially with HPA and in other setups where pods come and go. That can be looked at in another PR, though.
I think in the long term it would be better to separate model downloading from running the application services. That would allow model downloading to be centralized in a single service / container instead of being split over multiple pods, and even to be done as a separate step before starting any of the application pods.
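The separate-download-step suggestion could be sketched as a one-shot Kubernetes Job that populates a shared volume before any application pods start. This is only an illustration of the idea, not code from this chart; all resource names and the secret layout below are hypothetical.

```yaml
# Hypothetical sketch: a one-shot Job that downloads the model into a shared
# PVC before the application pods are started. Names are illustrative.
apiVersion: batch/v1
kind: Job
metadata:
  name: model-downloader        # hypothetical name
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: downloader
          image: huggingface/downloader:0.17.3  # or an image with huggingface-hub >= 0.26.5
          env:
            - name: HF_TOKEN    # read by huggingface-hub for gated models
              valueFrom:
                secretKeyRef:
                  name: hf-token    # hypothetical secret
                  key: token
          volumeMounts:
            - name: model-volume
              mountPath: /data
      volumes:
        - name: model-volume
          persistentVolumeClaim:
            claimName: model-pvc  # hypothetical PVC shared with the serving pods
```

Application pods would then mount the same PVC read-only, so model data is fetched once rather than once per pod.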
LGTM.
Signed-off-by: Lianhao Lu <[email protected]>
Fix issue opea-project#639
Signed-off-by: Lianhao Lu <[email protected]>
Description
Upgrade huggingface-hub to version 0.26.5 when downloading models, because the existing
huggingface/downloader:0.17.3
image doesn't honor the HF_TOKEN correctly. Loosen the tgi securityContext to allow running with multiple shards.
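As a hedged illustration of the securityContext change: TGI in multi-shard mode launches multiple worker processes that communicate over shared memory, which a fully locked-down pod securityContext can block. The exact fields changed in this PR may differ; the fragment below is only a sketch of the kind of relaxation the description refers to.

```yaml
# Illustrative only -- the actual fields touched by this PR may differ.
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
  # Relaxed so that multi-shard workers can write temporary/shared files:
  readOnlyRootFilesystem: false
```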
Issues
Fixes #641
Fixes #639
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
List any newly introduced 3rd-party dependencies.
Tests
Describe the tests that you ran to verify your changes.